

Search for: All records

Creators/Authors contains: "Peng, Aoran"


  1. Abstract In recent years, large language models (LLMs) and vision language models (VLMs) have excelled at tasks requiring human-like reasoning, inspiring researchers in engineering design to use language models (LMs) as surrogate evaluators of design concepts. But do these models actually evaluate designs like humans? While recent work has shown that LM evaluations sometimes fall within human variance on Likert-scale grading tasks, those tasks often obscure the reasoning and biases behind the scores. To address this limitation, we compare LM word embeddings (trained to capture semantic similarity) with human-rated similarity embeddings derived from triplet comparisons (“is A closer to B than C?”) on a dataset of design sketches and descriptions. We assess alignment via local tripletwise similarity and embedding distances, allowing for deeper insights than raw Likert-scale scores provide. We also explore whether describing the designs to LMs through text or images improves alignment with human judgments. Our findings suggest that text alone may not fully capture the nuances humans key into, yet text-based embeddings outperform their multimodal counterparts on satisfying local triplets. On the basis of these insights, we offer recommendations for effectively integrating LMs into design evaluation tasks.
    Free, publicly-accessible full text available October 1, 2026
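
    One way to read "satisfying local triplets" in the abstract above is as the fraction of human triplet judgments that an embedding reproduces under Euclidean distance. The sketch below illustrates that reading only; it is not taken from the paper. The function name `triplet_agreement`, the toy coordinates, and the sketch labels are all hypothetical.

```python
import math


def triplet_agreement(embedding, triplets):
    """Fraction of triplets (a, b, c), read as "a is closer to b than to c",
    that the embedding satisfies under Euclidean distance.

    Assumption: `embedding` maps an item label to a coordinate list.
    """
    if not triplets:
        raise ValueError("need at least one triplet")
    satisfied = sum(
        1
        for a, b, c in triplets
        if math.dist(embedding[a], embedding[b]) < math.dist(embedding[a], embedding[c])
    )
    return satisfied / len(triplets)


# Hypothetical toy data: three design sketches embedded in 2-D.
lm_embedding = {
    "sketch_1": [0.0, 0.0],
    "sketch_2": [1.0, 0.0],
    "sketch_3": [0.0, 3.0],
}

# Hypothetical human judgments in (anchor, closer, farther) form.
human_triplets = [
    ("sketch_1", "sketch_2", "sketch_3"),
    ("sketch_3", "sketch_1", "sketch_2"),
    ("sketch_2", "sketch_3", "sketch_1"),
]

rate = triplet_agreement(lm_embedding, human_triplets)
```

    A rate near 1.0 would mean the embedding orders most triplets the way the raters did; a rate near 0.5 would be no better than chance for two-way comparisons.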
  2. Abstract Well-studied techniques that enhance diversity in early design concept generation require effective metrics for evaluating human-perceived similarity between ideas. Recent work suggests collecting triplet comparisons between designs directly from human raters and using those triplets to form an embedding where similarity is expressed as a Euclidean distance. While effective at modeling human-perceived similarity judgments, these methods are expensive and require a large number of triplets to be hand-labeled. However, what if there were a way to use AI to replicate the human similarity judgments captured in triplet embedding methods? In this paper, we explore the potential for pretrained Large Language Models (LLMs) to be used in this context. Using a dataset of crowdsourced text descriptions written about engineering design sketches, we generate LLM embeddings and compare them to an embedding created from human-provided triplets of those same sketches. From these embeddings, we can use Euclidean distances to describe areas where human perception and LLM perception disagree regarding design similarity. We then implement this same procedure but with descriptions written from a template that attempts to isolate a particular modality of a design (i.e., functions, behaviors, structures). By comparing the templated description embeddings to both the triplet-generated and pre-template LLM embeddings, we identify ways of describing designs such that LLM and human similarity perception better agree. We use these results to better understand how humans and LLMs interpret similarity in engineering designs. 
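
    The comparison of Euclidean distances between a human-derived embedding and an LLM embedding, described in the abstract above, could be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the two embeddings are assumed to cover the same items, each set of pairwise distances is divided by its own mean so that the two spaces are on a comparable scale, and the names `pairwise_distances`, `normalize`, and `largest_disagreements` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(embedding):
    """Euclidean distance for every unordered pair of items."""
    return {
        (a, b): math.dist(embedding[a], embedding[b])
        for a, b in combinations(sorted(embedding), 2)
    }


def normalize(distances):
    """Divide by the mean distance so differently scaled spaces are comparable
    (an assumed normalization, chosen here for simplicity)."""
    mean = sum(distances.values()) / len(distances)
    if mean == 0:
        raise ValueError("all items coincide; distances cannot be normalized")
    return {pair: d / mean for pair, d in distances.items()}


def largest_disagreements(human_emb, llm_emb, top_k=3):
    """Pairs whose normalized LLM distance differs most from the human one.
    A negative gap means the LLM places the pair closer than humans do."""
    if set(human_emb) != set(llm_emb):
        raise ValueError("both embeddings must cover the same items")
    human = normalize(pairwise_distances(human_emb))
    llm = normalize(pairwise_distances(llm_emb))
    gaps = {pair: llm[pair] - human[pair] for pair in human}
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)[:top_k]


# Hypothetical toy embeddings; the two spaces may differ in dimension.
human_emb = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
llm_emb = {"A": [0.0, 0.0, 0.0], "B": [2.0, 0.0, 0.0], "C": [0.0, 10.0, 0.0]}

top = largest_disagreements(human_emb, llm_emb, top_k=1)
```

    Sorting by the absolute gap surfaces the design pairs where the two notions of similarity diverge most, which is where a closer look at the descriptions is likely to be informative.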
  3. Today's highly globalized world involves multidisciplinary, multicultural teams and brings more economic powers into the market. Effective innovation strategies are critical if emerging markets plan to become economic players in this increasingly connected global market. The current work compares the design processes of designers from emerging and established markets to understand how design methods are applied across cultures. Specifically, the design decisions of designers from Morocco, one of the four leading economic powers in Africa, and the U.S. are investigated. Concept generation and selection are the focus of the current study, as they are critical steps in the design process that can determine project outcomes. Previous studies have identified three factors (ownership bias, gender, and idea goodness) as influential during concept selection. The effect of these three factors on designers in the United States is well established. The current study expands upon previous findings to examine the influence of these factors across two cultures: the U.S. and Morocco. The results of this study, although preliminary, show that U.S. students had a higher idea fluency than Moroccan students. They also show a significant difference in idea fluency between genders in the U.S. but not in Morocco. In addition, participants overall exhibited ownership bias toward ideas with high goodness.